KAVUAKA: a low-power application-specific processor architecture for digital hearing aids
The power consumption of digital hearing aids is tightly restricted by their small physical size, and the hardware resources available for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve speech intelligibility for the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low power consumption and high processing performance.
The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized with state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming algorithms, and speech recognition. Specialized and application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors, and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16 % fewer cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6 % to 17 % with isolation and bypass techniques. The hardware-induced audio latency is 34 % lower compared to related audio interfaces for a frame size of 64 samples.
The final hearing aid system-on-chip (SoC) with four KAVUAKA processor cores and ten co-processors is integrated as an application-specific integrated circuit (ASIC) using a 40 nm low-power technology. The die size is 3.6 mm². Each of the processors and co-processors contains individual customizations and hardware features, with a datapath width varying between 24 and 64 bits. The core area of the 64-bit processor configuration is 0.134 mm². The processors are organized in two clusters that share memory, an audio interface, co-processors, and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for the SoC and 0.6 mW for the 64-bit processor.
Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging, instruction scheduling, and register allocation. The KAVUAKA processor architecture is compared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements.
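As a rough illustration of what a multiply-accumulate on complex-valued numbers computes, the following minimal Python sketch expresses one complex MAC step as four real multiplications plus additions. It is not the KAVUAKA hardware design; the function names `complex_mac` and `complex_dot` and the `(re, im)` tuple representation are assumptions made for this example only.

```python
def complex_mac(acc, a, b):
    """Return acc + a * b, where every value is a (re, im) tuple.

    Uses the textbook form with four real multiplications:
    (a_re + j*a_im) * (b_re + j*b_im)
        = (a_re*b_re - a_im*b_im) + j*(a_re*b_im + a_im*b_re)
    """
    acc_re, acc_im = acc
    a_re, a_im = a
    b_re, b_im = b
    return (acc_re + a_re * b_re - a_im * b_im,
            acc_im + a_re * b_im + a_im * b_re)


def complex_dot(xs, ys):
    """Accumulate the element-wise products of two complex sequences."""
    acc = (0, 0)
    for x, y in zip(xs, ys):
        acc = complex_mac(acc, x, y)
    return acc


print(complex_mac((0, 0), (1, 2), (3, 4)))
print(complex_dot([(1, 2), (0, 1)], [(3, 4), (1, 0)]))
```

Kernels such as the FFT butterfly consist largely of steps of this shape, which is presumably why a dedicated complex MAC unit can reduce the cycle count.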
Do Repeat Yourself: Understanding Sufficient Conditions for Restricted Chase Non-Termination
The disjunctive restricted chase is a sound and complete procedure for
solving boolean conjunctive query entailment over knowledge bases of
disjunctive existential rules. Alas, this procedure does not always terminate
and checking if it does is undecidable. However, we can use acyclicity notions
(sufficient conditions that imply termination) to effectively apply the chase
in many real-world cases. To know if these conditions are as general as
possible, we can use cyclicity notions (sufficient conditions that imply
non-termination). In this paper, we discuss some issues with previously
existing cyclicity notions, propose some novel notions for non-termination by
dismantling the original idea, and empirically verify the generality of the new
criteria.
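To make the termination question concrete, here is a minimal Python sketch of the restricted chase for a single, non-disjunctive existential rule, R(x, y) → ∃z. R(y, z). It is a toy illustration rather than the procedure or the cyclicity notions from the paper; the function name `restricted_chase`, the step budget `max_steps`, and the fresh-null naming scheme are assumptions for this example.

```python
from itertools import count


def restricted_chase(facts, max_steps=10):
    """Chase the rule R(x, y) -> exists z. R(y, z) over a set of (x, y) pairs.

    Returns (facts, terminated). terminated is False when the step budget
    ran out, which only suggests non-termination; it does not prove it.
    """
    facts = set(facts)
    fresh = count()
    for _ in range(max_steps):
        # Restricted chase: a trigger is active only if its head is not
        # already satisfied, i.e. there is no fact R(y, _) yet.
        active = [(x, y) for (x, y) in sorted(facts)
                  if not any(u == y for (u, _v) in facts)]
        if not active:
            return facts, True
        _, y = active[0]
        facts.add((y, f"n{next(fresh)}"))  # introduce a fresh null for z
    return facts, False


print(restricted_chase({("a", "a")}))
print(restricted_chase({("a", "b")}, max_steps=5))
```

Starting from R(a, a), the head is already satisfied and the chase stops immediately. Starting from R(a, b), every new fact creates a new active trigger, so the run exhausts any finite budget.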
A Survey on Application Specific Processor Architectures for Digital Hearing Aids
On the one hand, processors for hearing aids are highly specialized for audio processing; on the other hand, they have to meet challenging hardware restrictions. This paper aims to provide an overview of the requirements, architectures, and implementations of these processors. Special attention is given to the increasingly common application-specific instruction-set processors (ASIPs). The main focus of this paper lies on hardware-related aspects such as the processor architecture, the interfaces, the application-specific integrated circuit (ASIC) technology, and the operating conditions. The different hearing aid implementations are compared in terms of power consumption, silicon area, and computing performance for the algorithms used. Challenges for the design of future hearing aid processors are discussed based on current trends and developments.
Indirect Meltdown: Building Novel Side-Channel Attacks from Transient-Execution Attacks
The transient-execution attack Meltdown leaks sensitive information by
transiently accessing inaccessible data during out-of-order execution. Although
Meltdown is fixed in hardware for recent CPU generations, most
currently-deployed CPUs have to rely on software mitigations, such as KPTI.
Still, Meltdown is considered non-exploitable on current systems. In this
paper, we show that adding another layer of indirection to Meltdown transforms
a transient-execution attack into a side-channel attack, leaking metadata
instead of data. We show that despite software mitigations, attackers can still
leak metadata from other security domains by observing the success rate of
Meltdown on non-secret data. With LeakIDT, we present the first cache-line
granular monitoring of kernel addresses. LeakIDT allows an attacker to obtain
cycle-accurate timestamps for attacker-chosen interrupts. We use our attack to
get accurate inter-keystroke timings and fingerprint visited websites. While we
propose a low-overhead software mitigation to prevent the exploitation of
LeakIDT, we emphasize that the side-channel aspect of transient-execution
attacks should not be underestimated. Comment: published at ESORICS 202
Reviving Meltdown 3a
Since the initial discovery of Meltdown and Spectre in 2017, different
variants of these attacks have been discovered. One often overlooked variant is
Meltdown 3a, also known as Meltdown-CPL-REG. Even though Meltdown-CPL-REG was
initially discovered in 2018, the available information regarding the
vulnerability is still sparse. In this paper, we analyze Meltdown-CPL-REG on 19
different CPUs from different vendors using an automated tool. We observe that
the impact is more diverse than documented and differs from CPU to CPU.
Surprisingly, while the newest Intel CPUs do not seem affected by
Meltdown-CPL-REG, the newest available AMD CPUs (Zen3+) are still affected by
the vulnerability. Furthermore, given our attack primitive CounterLeak, we show
that besides up-to-date patches, Meltdown-CPL-REG can still be exploited as we
reenable performance-counter-based attacks on cryptographic algorithms, break
KASLR, and mount Spectre attacks. Although Meltdown-CPL-REG is not as powerful
as other transient-execution attacks, its attack surface should not be
underestimated. Comment: published at ESORICS 202
Hammulator: Simulate Now - Exploit Later
Rowhammer, first considered a reliability issue, turned out to be a significant threat to the security of systems. Hence, several mitigation techniques have been proposed to prevent the exploitation of the Rowhammer effect. Consequently, attackers developed more sophisticated hammering and exploitation techniques to circumvent mitigations. Still, the development and testing of Rowhammer exploits can be a tedious process, taking multiple hours to get the bit flip at the correct location.
In this paper, we propose Hammulator, an open-source rapid-prototyping framework for Rowhammer exploits. We simulate the Rowhammer effect using the gem5 simulator and DRAMsim3 model, with a parameterizable implementation that allows researchers to simulate various types of systems. Hammulator enables faster and more deterministic bit flips, facilitating the development of Rowhammer proof-of-concept exploits and defenses. We evaluate our simulator by reproducing 2 open-source Rowhammer exploits. We also evaluate 2 previously proposed mitigations, PARA and TRR, in our simulator. Additionally, our micro- and macrobenchmarks show that our simulator has a small average overhead in the range of 6.96 % to 10.21 %. Our results show that Hammulator can be used to compare Rowhammer exploits objectively by providing a consistent testing environment. Hammulator and all experiments and evaluations are open source, hoping to ease the research on Rowhammer
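The idea behind PARA (probabilistic adjacent row activation) can be sketched with a toy model: each aggressor activation disturbs a neighbouring victim row, and with probability `p` the neighbour is refreshed, which resets the accumulated disturbance. The sketch below is not Hammulator's implementation and does not use gem5 or DRAMsim3; the function name `hammer`, the single-victim model, and the `flip_threshold` parameter are assumptions for illustration.

```python
import random


def hammer(aggressor_activations, flip_threshold, p, seed=0):
    """Return True if the victim row flips in this toy model.

    The victim flips once it has accumulated flip_threshold disturbances
    without an intervening refresh. p is the PARA refresh probability.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    disturbance = 0
    for _ in range(aggressor_activations):
        disturbance += 1
        if disturbance >= flip_threshold:
            return True
        if rng.random() < p:
            disturbance = 0  # PARA refreshed the neighbouring (victim) row
    return False


print(hammer(5000, flip_threshold=1000, p=0.0))
print(hammer(5000, flip_threshold=1000, p=1.0))
```

Fixing the seed mirrors, in a very small way, the benefit the abstract attributes to simulation: repeatable behaviour instead of waiting for a bit flip on real hardware.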
Wormhole Effects on Yang-Mills Theory
In this paper wormhole effects on YM theory are examined. The
wormhole wave functions for the scalar, the vector and the tensor expansion
modes are computed assuming a small gauge coupling which leads to an effective
decoupling of gravity and YM theory. These results are used to determine the
wormhole vertices and the corresponding effective operators for the lowest
expansion mode of each type. For the lowest scalar mode we find a
renormalization of the gauge coupling from the two point function and the
operators \tr (F^3), \tr (F^2\tilde{F}) from the three point function. The
two point function for the lowest vector mode contributes to the gauge coupling
renormalization only whereas the lowest tensor mode can also generate higher
derivative terms. Comment: 15 pages, TUM--TH--165/9
KAVUAKA: Chip Design for Digital Hearing Aids
At the Institute of Microelectronic Systems (IMS), research within the Cluster of Excellence Hearing4all investigates how signal-processing chips for digital hearing aid systems can be designed and optimized based on complex hearing aid algorithms. The aim of the research is to develop novel processor architectures that provide the required high computing performance, consume very little power, and can be integrated into small hearing aid housings.
A Security RISC: Microarchitectural Attacks on Hardware RISC-V CPUs
Microarchitectural attacks threaten the security of computer systems even in the absence of software vulnerabilities. Such attacks are well explored on x86 and ARM CPUs, with a wide range of proposed but not-yet deployed hardware countermeasures. With the standardization of the RISC-V instruction set architecture and the announcement of support for the architecture by major processor vendors, RISC-V CPUs are on the verge of becoming ubiquitous. However, the microarchitectural attack surface of the first commercially available RISC-V hardware CPUs is not yet explored. This paper analyzes the two commercially-available off-the-shelf 64-bit RISC-V (hardware) CPUs used in most RISC-V systems running a full-fledged commodity Linux system. We evaluate the microarchitectural attack surface, which leads to the introduction of 3 new microarchitectural attack techniques: Cache+Time, a novel cache-line-granular cache attack without shared memory, Flush+Fault exploiting the Harvard cache architecture for Flush+Reload, and CycleDrift exploiting unprivileged access to instruction-retirement information. Additionally, we show that many known attacks are applicable to these RISC-V CPUs, mainly due to non-existing hardware countermeasures and instruction-set subtleties that do not consider the microarchitectural attack surface. We demonstrate our attacks in 6 case studies, including the first RISC-V-specific microarchitectural KASLR break and a CycleDrift-based method for detecting kernel activity. Based on our analysis, we stress the need to consider the microarchitectural attack surface during every step of a CPU design, including custom instruction-set extensions
Collide+Power: Leaking Inaccessible Data with Software-based Power Side Channels
Differential Power Analysis (DPA) measures single-bit differences between data values used in computer systems by statistical analysis of power traces. In this paper, we show that the mere co-location of data values, e.g., attacker and victim data in the same buffers and caches, leads to power leakage in modern CPUs that depends on a combination of both values, resulting in a novel attack, Collide+Power. We systematically analyze the power leakage of the CPU's memory hierarchy to derive precise leakage models enabling practical end-to-end attacks. These attacks can be conducted in software with any signal related to power consumption, e.g., power consumption interfaces or throttling-induced timing variations. Leakage due to throttling requires 133.3 times more samples than direct power measurements. We develop a novel differential measurement technique amplifying the exploitable leakage by a factor of 8.778 on average, compared to a straightforward DPA approach. We demonstrate that Collide+Power leaks single-bit differences from the CPU's memory hierarchy with fewer than 23000 measurements. Collide+Power varies attacker-controlled data in our end-to-end DPA attacks. We present a Meltdown-style attack, leaking from attacker-chosen memory locations, and a faster MDS-style attack, which leaks 4.82 bit/h. Collide+Power is a generic attack applicable to any modern CPU, arbitrary memory locations, and victim applications and data. However, the Meltdown-style attack is not yet practical, as it is limited by the state of the art of prefetching victim data into the cache, leading to an unrealistic real-world attack runtime with throttling of more than a year for a single bit. Given the different variants and potentially more practical prefetching methods, we consider Collide+Power a relevant threat that is challenging to mitigate